Supplementary Material for P-Flow

Neural Information Processing Systems

The link to our demo page is https://bit.ly/3ID5Zam. The results section of the main paper reports the objective metrics as a function of the number of Euler steps. We measure acoustic quality using 5-point Mean Opinion Scores (MOS).


P-Flow: A Fast and Data-Efficient Zero-Shot TTS through Speech Prompting

Sungwon Kim 1,2, Kevin J Shih

Neural Information Processing Systems

Our work proposes P-Flow, a fast and data-efficient zero-shot TTS model that uses speech prompts for speaker adaptation. P-Flow comprises a speech-prompted text encoder for speaker adaptation and a flow matching generative decoder for high-quality and fast speech synthesis.
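
The role of the Euler steps in a flow matching decoder can be illustrated with a minimal sketch. It is not the P-Flow implementation. It assumes a straight-line probability path from noise to data, and the names `toy_velocity` and `euler_sample` are hypothetical. A hand-written velocity field stands in for the trained decoder, which in the real model would be a neural network conditioned on the text encoder output.

```python
import numpy as np


def toy_velocity(x, t, target):
    """Hypothetical stand-in for a learned velocity field v(x, t).

    For a straight-line path toward a single target point, the field that
    transports any x at time t to the target at time 1 is
    (target - x) / (1 - t). A trained decoder would predict this instead.
    """
    return (target - x) / (1.0 - t)


def euler_sample(x0, velocity, n_steps):
    """Integrate dx/dt = velocity(x, t) from t=0 to t=1 with fixed Euler steps."""
    x = np.asarray(x0, dtype=float).copy()
    h = 1.0 / n_steps  # step size; fewer steps means faster synthesis
    for k in range(n_steps):
        t = k * h  # t stays strictly below 1, so the toy field never divides by zero
        x = x + h * velocity(x, t)
    return x


# Start from Gaussian noise and transport it toward an assumed target vector.
rng = np.random.default_rng(0)
target = np.array([1.0, -2.0, 0.5])
noise = rng.standard_normal(3)
sample = euler_sample(noise, lambda x, t: toy_velocity(x, t, target), n_steps=10)
```

With this toy field the sampler reaches the target for any number of steps, because the path is exactly straight. A learned field is only approximately straight, so output quality generally depends on the number of Euler steps, which is why metrics are reported per step count.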